A graph theoretical analysis of the energy landscape of model polymers 



o 
o 



(N 



Marco Baiesi/ Lorenzo Bongini,^'0 Lapo Casetti/'^ and Lorenzo Tattini'^ 

^ Dipartimento di Fisica and Centra interdipartimentale per lo Studio delle Dinamiche Complesse (CSDC), 
Universita di Firenze, via G. Sansone 1, 50019 Sesto Fiorentino, Ital^ 
^Istituto Nazionale di Fisica Nucleare (INFN), sezione di Firenze, 
via G. Sansone 1, 50019 Sesto Fiorentino, Italy 
Dipartimento di Chimica, Universita di Firenze, 
via della Lastruccia 3, 50019 Sesto Fiorentino, Italy 

In systems characterized by a rough potential energy landscape, local energetic minima and 
saddles define a network of metastable states whose topology strongly influences the dynamics. 
Changes in temperature, causing the merging and splitting of metastable states, have non trivial 
effects on such networks and must be taken into account. We do this by means of a recently 
proposed renormalization procedure. This method is applied to analyze the topology of the network 
of metastable states for different polypeptidic sequences in a minimalistic polymer model. A smaller 
spectral dimension emerges as a hallmark of stability of the global energy minimum and highlights 
a non-obvious link between dynamic and thermodynamic properties. 
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I. INTRODUCTION 



The dynamical behavior of polymers in the collapsed 
phase has been a field of active research since the con- 
cepts of solvent and collapse transition were introduced 
by Flory in the early fifties of the XX century [ij. In 
the last decades many statistical mechanical analyses of 
polymer models have been carried out Q , studying con- 
figurational plasticity and dynamics, and their depen- 
dence on the microscopic details of the physical inter- 
actions between monomers and with the solvent. How- 
ever, the inherent frustration of these models often hin- 
ders analytical approaches, and even the investigations 
of very simple polymer models still have to be under- 
taken on a numerical basis. This fundamental difficulty 
also affects the study of an important class of heteropoly- 
mers, namely proteins. According to their sequence these 
molecules fold into a unique tridimensional structure, the 
native configuration, which corresponds to the minimum 
of the potential energy. Natural proteins show a remark- 
able variety of folding behaviors. Folding times might 
span a couple of orders of magnitude even for proteins 
of comparable size [3j] and folding kinetics of very similar 
proteins might change from two-stage to more complex, 
multistage transition schemes j^. Moreover, in the last 
twenty years, the development of combinatorial methods 
of protein synthesis has allowed the experimental veri- 
fication of an old paradigm of protein science: random 
sequence polypeptides very rarely fold It is, however, 
still not clear how exactly proteins differ from random 
hctcropolymers and how these in turn differ from a ho- 
mopolymer. 
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Minimalistic models might be useful in attacking such 
basic questions because they can be investigated more 
easily than complex and more realistic models. Since the 
proposal of the lattice HP model f6l| several protein mod- 
els have proven capable of qualitatively reproducing the 
main thermodynamic features of the folding process, such 
as the folding and 0-transition temperatures, while tak- 
ing into account the effect of different primary sequences 
01 . Most of these models exhibit a funneled energy land- 
scape characterized by the presence of many competing 
local minima of the potential energy Q, in strict anal- 
ogy with structural glasses [1, [l^ . In this framework 
the folding properties of a given protein sequence are of- 
ten depicted as arising from the interplay between fun- 
nel steepness (the global bias toward the native config- 
uration) and the landscape roughness (the number and 
average depth of energy minima). More precisely each 
protein is characterized by a well defined temperature 
range in which it manages to attain its native structure. 
When this range is sufficiently large the corresponding 
sequence can be defined a "good folder" . The same def- 
inition is often applied also to fast folding sequences, in 
analogy to the short folding times that generally char- 
acterize real proteins. The relations between the folding 
properties of proteins and the topography of their energy 
landscape have been investigated at length leading to the 
evidence of complex kinetic behaviors [ill, 113 ^'^'^ ^o the 
proposal of several criteria for the identification of fast 
folders using equilibrium indicators [H, [l3| ■ 

A very promising method for analyzing the topogra- 
phy of energy landscapes is representing them as a net- 
work. Dynamic trajectories on rough energy landscapes 
typically exhibit a separation of time scales in the sam- 
pling of the available configurational space. The first- 
order saddles of the potential that connect the basins of 
attraction of different minima also represent kinetic bot- 
tlenecks for the system and usually induce a partitioning 
of the configuration space in a finite set of metastable 
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states, each characterized by a large escape time and a 
fast local difFusivity. In this paper we will stress that a 
metastable state in a system at finite temperature can 
either correspond to the basin of attraction of a single 
minimum of the potential energy or, more generally, to a 
collection of different minima linked by low-energy sad- 
dles. Under these conditions the long-term dynamics of 
the system consists of a series of activated jumps between 
different metastable states. Trajectories can therefore be 
described by a master equation which depends only on 
the transition rates between different metastable states 
[TH - In this context the investigation of the statistical 
and topological properties of the graph representing the 
network of connections between metastable states (NMS) 
provides a tool for describing the structural organization 
of the landscape and its influence on the folding dynam- 
ics. 

Preliminary work on a two-dimensional toy model [l6j 
has shown that there are quantitative differences in the 
topography of the energy landscape of heteropolymers 
and homopolymers. Differences are found also between 
fast folding and slow folding heteropolymers. However, 
although the model used in [l^ correctly shows all the 
distinctive thermodynamic phases expressed by random 
heteropolymers and proteins, its two-dimensional char- 
acter raises serious concerns about its capability of re- 
producing the conformational flexibility of real polymers. 
One of the goals of this work is, therefore, to explore the 
topography of the energy landscape of different polymers 
and heteropolymers using a more realistic, although still 
minimalistic, representation. 

In this paper we investigate the topology of the NMS 
of the energy landscape of different sequences in a mini- 
malistic three-dimensional off-lattice polymer model. In 
Sec. im and Sec. IIIII we describe the model and the tech- 
nique employed to sample the relevant fixed points of 
the energy landscape. In Sec. IIVI we present a renor- 
malization procedure suitable to group the basins of at- 
traction of the existing minima of the potential energy 
into temperature dependent metastable states. In Sec. IVl 
we briefiy discuss the thermodynamic properties of the 
systems while in the following sections we analyze the 
topology of their NMS. More precisely, in Sec. I VII we de- 
tail the change in topology for increasing chain length in 
hydrophobic homopolymers and in Sec. IVIII we investi- 
gate the topological differences between the NMS of two 
heteropolymeric sequences characterized by very differ- 
ent stabilities of the native structure. 



II. THE MODEL 

We consider an off-lattice coarse-grained model for 
short peptides that has been recently studied by Clementi 
and coworkers [13, [H, Ull • The model describes a linear 
molecule with N residues, each one representing the po- 
sition Xi of a Cq, atom. The bond vectors fi are Xi^i — Xi, 
with length = \ri\. Following the notation of Ref. p^ . 



we define also the distances r^.j = \xi — Xj\. An an- 
gle between subsequent bonds and r^+i is denoted by 
6i, while ipi is the dihedral angle formed by the residues 
i,i + l,i + 2,i + 3. The potential energy has already been 
discussed in appendices of Refs. [17|, ilJi]. It is composed 
of the following contributions: 



V = ybond + Kng + Vdih + Vlj . 

One has a bond term 

N-l 



14ond = A:r^(r, -rf^)2, 
an angular term 

Vang ^hY^ 



(1) 



(2) 



i=l 



N-2 



(3) 



a dihedral term 



ih 



E 

i=l 



fcW(l -cos(^,- )))-!- 



fc(2)(i_cos3((p,-^f )) 



(4) 



and a Lennard-Jones term 
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(5) 



The notation refers to a sum over pairs z, j with j — 
i > 4, i.e., over pairs of residues that are not consecutive 
to each other on the chain. The choice of the contact map 
C, of the constants r -"^ and of the distances 

CTi J , determines the kind of the peptide; the values chosen 
for these constants as well as for the remaining ones are 
reported below. 

In this paper we aim at understanding relations be- 
tween the topography of the energy landscape of a poly- 
mer and the topology of its NMS. We also investigate 
how these features are influenced by the conformational 
stability and the size of the system. We have therefore 
analyzed three cases: two heteropolymers with = 12 
monomers, characterized by 

(a) a highly stable native configuration (which from 
now on we will call the stable folder) 

(b) a highly unstable native configuration (which from 
now on we will call the unstable folder) 

and 

(c) a hydrophobic homopolymer with 4 < iV < 12. 
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The latter has been considered because it represents a 
system amenable to be studied at different sizes with- 
out introducing non-obvious sequence dependent effect, 
as one would do with heteropolymers. As a stable folder 
we chose a a-helix studied in |19|]. Analogously to real 
helices, this system is characterized by a strong energetic 
bias toward the native configuration, granted by a Go- 
like potential with Cij = 1 only for j = i-f4 that mimics 
the hydrogen bonding responsible for helix stabilization. 
On the contrary the unstable sequence was chosen as the 
portion from residue i = 8 to residue 17 that forms a 
beta-sheet in the model considered in ■ This segment 
is stabilized by interactions with residues not included in 
our selection and is therefore highly unstable and sub- 
stantially unstructured when isolated. 

Finally for homopolymers we used C^j — 1 (still with 
j > i+4) to induce a generic attraction between residues, 
and parameters have been homogeneously fixed for all 
residues to rf^ = 3.8, of^ = 1.6, ^ = 0.88, cr^j = 
6.30, ao = 3, ei = 10, 62 = 5ei. In all cases (a)-(c), 
the dihedral potential has a main minimum and two side 
minima and the values of the constants fc's in the various 
contribution to the potential are proportional to those 
of Ref. [H: k, = 418.41, kg = 83.682, k^^^ = 4.1841, 
fci?^ = 2.092. 



III. SAMPLING THE LANDSCAPE 

The energy landscape of each polymer has been ex- 
plored by means of a new take on the activation relax- 
ation technique (ART) [13, [SI. The original ART is a 
method to jump from a minimum of the potential energy 
to a neighboring one, and so on with a sort of random 
walk, until the space of minima has been satisfactorily 
explored. Each local minimum is characterized by a pos- 
itive curvature of the energy function in all directions. 
The matrix with the second partial derivatives of the po- 
tential energy is the Hessian Ti, and at a minimum it has 
all non-negative eigenvalues. The escape from a mini- 
mum with the ART goes via an activation that attempts 
to find first a nearby saddle in the energy landscape. This 
is done by slowly forcing the chain in a random direction 
of the 3A''-dimensional configuration space until a nega- 
tive eigenvalue of 7i arises. The direction of the negative 
eigenvalue is then followed until the force vanishes, which 
indicates that a saddle point of the energy function has 
been reached. By a gentle push to the other side of the 
saddle and with a subsequent minimization, eventually 
a new minimum is reached. The new found minimum 
is then added to the catalog of minima if not already 
present. The same is done for the saddle with a separate 
catalog. In our case, configurations of the newly recov- 
ered minima and saddles are compared to the already 
recorder ones by means of a contact distance analogous 
to that described in p^ . 

In our model, as in other polymer models where ART 



has been used [12, [13, [13| , one cannot deform a configu- 
ration at random during activation, because the action of 
the strong spring force (with constant fc^ » fee ^ k^v^) 
prevents the system from accurately following the direc- 
tion of the negative eigenvalue. Indeed, upon forcing the 
bond potential Vbond with a random deformation, the 
configuration always bounce back to the minimum with- 
out reaching any saddle. We have thus chosen to perturb 
only one dihedral angle at a time, while the rest of the 
molecule is allowed to deform according to the potential 
in order to find the minimum energy compatible with the 
imposed dihedral. When a negative eigenvalue is found, 
it is followed with a deformation along the same direction 
while energy is still minimized in the orthogonal direc- 
tions, as in usual ART. This is the point that indeed fails 
if the random deformation is performed. 

Contrary to usual ART, our approach involves a finite 
amount of possible deformations, because only 2(iV — 3) 
possible changes of dihedral angles can be tried. This 
feature is not intended to promote a random diffusion in 
the space of minima, as in standard ART implementa- 
tions, but rather to implement a systematic protocol for 
the cataloging of all saddles and minima. More precisely 
we adopt the following procedure: starting from an ini- 
tial catalog with just one minimum, we try all possible 
activations from single dihedral deformations of that con- 
figuration. For each transition to another minimum, as 
before, we check whether the minimum is already in the 
catalog and eventually add it (and the same for the sad- 
dle). After all deformations for the first minimum have 
been attempted, we repeat the process from the second 
minimum of the catalog, and so on. The algorithm stops 
at the il/'th minimum if no new minima are found. At 
this point the catalogs of minima and saddles are consid- 
ered complete. One can view the whole process as an at- 
tempt of exact enumeration of minima and saddles. The 
choice of a finite set of activation moves, which might 
in principle lead to poor sampling of the configuration 
space, is justified in our case by the limited number of 
essential degrees of freedom (the — 3 dihedral angles) . 

Another delicate numerical issue is how to precisely 
follow the negative eigenvalue direction: if the negative 
eigenvalue direction is followed soon after it has been de- 
tected, the configuration could bounce back to the min- 
imum, and consequently one could miss a saddle. To 
avoid this problem, the dihedral deformation is contin- 
ued until the negative eigenvalue is lower than a small 
threshold e < 0. In principle this procedure might ex- 
ceedingly deform the configuration and lead to a saddle 
that does not belong to the basin of the minimum where 
one has started from. However, cross checks in the tran- 
sitions from and to minima indicate that this effect, if 
present, is very small. 

A substantial portion of the algorithm we used is based 
on software downloaded from N. Mousseau's web-page 
[25| . version 2006. This software is portable and allows 
for an easy replacement of the energy function, which 
was for Lennard-Jones clusters of atoms in origin. 
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IV. RENORMALIZATION 

As already noted in the introduction, the modehng of 
polymer dynamics as a hopping process relies on the ex- 
istence of a separation of time scales. The configuration 
space of many systems at sufficiently low temperature, 
like collapsed globules and glasses, can be partitioned 
into metastable states, regions whose internal sampling 
time is significantly lower than the escape time. In the 
traditional double-well picture it is straightforward to 
identify such regions as the basins of attraction of differ- 
ent local minima of the potential energy, and the crossing 
of the first-order saddle that separate them as the time- 
limiting step in the exploration of the landscape. This 
picture, however, fails to correctly reproduce the appro- 
priate division of time scales in more complex potential 
energy landscapes, with many minima separated by en- 
ergy barriers that span a wide range of energies. In these 
cases many minima of the potential energy are separated 
from their neighbors by saddles that are much lower than 
the available thermal energy and therefore fail to rep- 
resent effective kinetic bottlenecks. In these conditions 
metastable states do not consist anymore of single min- 
ima of the potential energy, but are instead composed of a 
collection of basins of attraction of different minima sep- 
arated by small activation energies. Figure [T] illustrates 
how the configuration space partitioning into metastable 
states depends on temperature: the shaded areas in the 
figure represent the regions mostly visited due to thermal 
agitation, while the rest of the landscape is only rarely 
explored. An energy barrier able to sufficient to dynami- 
cally isolate the basins of attraction of two different min- 
ima A and B at low temperatures might not represent a 
relevant kinetic barrier at higher temperatures anymore. 
Hence the two minima must be considered as belonging 
to the same metastable state AU B. Only in the zero- 
temperature limit a metastable state corresponds to the 
basin of attraction of a single local minimum of the po- 
tential energy. In this case the NMS can be easily deter- 
mined as the graph whose nodes are the potential energy 
minima and the links are first-order saddles connecting 
their basins of attractions. 

In order to determine the NMS at a given non-zero 
temperature we employ the recursive renormalization 
procedure proposed in (l6j . Its main ingredient is the 
single coalescence step illustrated in Fig. [2] if one of the 
energy barriers separating node 1 from node 2, W12 or 
W21, is smaller than the current temperature, the two 
nodes coalesce together forming a new node A. Between 
the possible paths from and to the new node, those with 
minimal energy barrier are kinetically the more relevant. 
We chose therefore to consider A as connected to the rest 
of the NMS only by means of the minimal energy con- 
nections. For example, considering node 3 in Fig. [2l the 
relevant connection would be Wa3 =min(VFi3, 1^23) and 
W3A =min(W3i, W32). 

The actual implementation of the algorithm is as fol- 
lows. First of all, we sort all connections in ascending 
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FIG. 1: Sketch of temperature dependence of the partitioning 
in metastable states in a three-well potential model. At low 
temperature the saddle separating the minima A and B is 
sufficiently high to provide a separation of time scales. When 
thermal agitation increase, the explored regions around the 
minima of the energy landscape (shaded areas) become wider. 

order according to their energy barrier. Then, starting 
from the first connection, we apply the following itera- 
tive procedure to all connections whose energy barrier is 
lower than the reference temperature: 

(i) The minimal energy node connected by the selected 
saddle is identified and its label replaces the label 
corresponding to the other node interested by the 
connection in the entire connections database. 

(m) The database of connections is searched for even- 
tual multiple connections between the same pairs 
of nodes. When such connections are found only 
the one characterized by the lowest energy barrier 
is kept and the other ones are erased. 

(Hi) The selected connection is finally erased from the 
database. 

This renormalization algorithm allows to determine the 
NMS at any given temperature starting from the zero- 
temperature NMS. As already stressed, the surviving 
nodes do not represent minima anymore but collections 
of minima that are best interpreted as metastable states. 

We notice that in general the renormalization process 
tends to increase the average connectivity of the NMS 
since the new node emerging from the coalescence of two 
neighboring nodes inherits all their connections. This ef- 
fect, however is partially mitigated by the presence of 



shared connections between the coalescing nodes. Defin- 
ing the average connectivity c as the average number of 
hnks per node and calhng k the number of shared con- 
nections between two coalescing nodes, it can be easily 
shown that c will decrease upon renormalization only if 
k> c~2. 

The large scale dynamics of the original molecular 
model is now suitably described by a diffusion on the 
NMS. This process is governed by a linear master equa- 
tion 



(a) 



dt 



(6) 



where Pi (i) is the probability of residing on the i-th node 
of the graph at time t, while is a temperature depen- 
dent Laplacian matrix whose elements are determined by 
the rates of transition between different nodes. 



N 



~ 



(7) 



k=l 



with Tjk being the probability per unit time of a transi- 
tion from node j to node k. 

If the NMS is globally connected, the Laplacian ma- 
trix has only one eigenvector with zero eigenvalue cor- 
responding to the stationary probability distribution on 
the graph. Moreover, it can be shown that the Laplacian 
matrix of the zero-temperature NMS is semi-positive- 
definite, and direct computation show that this feature 
is conserved by the renormalization procedure flG*! . As a 
consequence, Eq. ^ describes the relaxation to equilib- 
rium on the graph. 

By dropping the kinetic information Tjk one can define 
a discrete Laplacian matrix W given by 



if ^ 



if r,- 







(8) 



that can be interpreted as describing the process of re- 
laxation to equilibrium in a time unit corresponding to 
the number of jumps between different nodes. We will see 
later that the spectral properties of the discrete Laplacian 
matrix provide useful insights on the topological proper- 
ties of the NMS. 



V. THERMODYNAMICS 

In order to assign a clear quantitative meaning to the 
temperature scale in our model we determine the fold- 
ing transition and the G-transition temperatures of each 
sequence. The use of the term "transition" in this con- 
text must be clarified. The systems we study are far 
from the thermodynamic limit, so that no sharp thermo- 
dynamic transition occurs and a transition temperature 
is not rigorously defined. Nonetheless, a change in the 
thermodynamic behavior does occur in a relatively nar- 
row temperature range, so that it is customary to refer 



Wl2 





(c) 



W3. 



FIG. 2: Coalescence of two nodes. If one of the energy barriers 
separating 1 from 2, 14^12 or W21, is smaller than the reference 
temperature, the two nodes coalesce together forming a new 
node A that is connected to node 3 by a single energy barrier, 
which is min(M^i3, W23) to go from A to 3, and min(VK3i , W32) 
to come back from 3 to A. 



to a "0-transition temperature" and a "folding temper- 
ature" even for short polymers. When the transitional 
phenomenon under investigation has a sharp counterpart 
in the thermodynamic limit (as is the case with the 0- 
transition) the finite-size transition temperature is de- 
termined using methods which would give the correct re- 
sult in the thermodynamic limit. The case of the folding 
transition is different because it does not have a thermo- 
dynamic limit counterpart [26l.[27|. and various methods 
have been proposed to give a definition of a folding tem- 
perature, which typically yield comparable results p^ . 

For convenience we measure the temperature in mul- 
tiples of kg, which is then formally set as k^ = 1. The 
folding transition was primarily determined by using the 
"50% criterion" : the system at its folding temperature 
has equal probability of residing in the native structure 
as outside of it. As far as the 0-transition temperature is 
concerned, one should in principle be able to easily deter- 
mine it by the presence of maxima of either the specific 
heat Cv or of the derivative of the gyration radius with 
temperature, R' = dxR- An estimate of these quanti- 
ties can be easily computed by expanding the potential 
energy at the second order in the fixed points. The par- 
tition function of the system can then be expressed as a 
sum over all minima: 



-0Vi 



(9) 



where Vi is the potential energy of minimum i and Qi 
is the product of non-zero eigenfrequencies of the rela- 
tive Hessian Hi- If we translate this in a local entropy 
Si = —kghifli, the probability of residing in the i-th 
minimum can be written as a function of the local free 
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energy 

p.m - ^f-''"""^''^ (10) 

For a given configuration-dependent quantity S, the av- 
erage value on the landscape can then be expressed as 

(S(/3))=5]S,p,(/3). (11) 

i 

In this contest quantities such as the specific heat = 
d{E)/dT and the average gyration radius (mean square 
distance of the chain elements form their center of mass) 
can be computed once the configurations corresponding 
to each minimum are known. The quality of this ap- 
proximation has been thoroughly tested for several rough 
energy landscapes [1, [oj showing, as far as protein mod- 
els are concerned, a reasonable accuracy at temperatures 
comparable with the folding and 8-transitions psj . 

Clearly the definition of the folding transition based 
on the 50% criterion suffers of eventual ambiguities in 
the determination of the native configuration. The ho- 
mopolymers that we model tend to be affected by this 
problem. Due to the fact that the energy of homopoly- 
mers is invariant upon chain-reversal, the minima of the 
potential have a symmetric image, excluding those for 
which the two symmetric images coincide. We find that 
the probability that the absolute minimum has a non- 
symmetric configuration are very high and in the se- 
quences analyzed only the hydrophobic homopolymer of 
length iV = 10 does not fall in this category. In all other 
cases energy has two symmetric global minima, and they 
must both be used in order to correctly implement the 
50% criterion described above. Analogously, due to the 
finite size of the systems under study, also the determi- 
nation of the 0-transition temperature might be strongly 
affected by the parameter used to define it [l^. In this 
model the two consensus choices, and R' , give very 
similar indications. More precisely, for each sequence 
analyzed, the specific heat shows two maxima, and the 
lower one in temperature always coincides with Tf as 
determined by means of the 50% criterion. R' might 
instead show either two maxima, or a pronounced min- 
imum and a maximum, the minimum always appearing 
at lower temperature than the maximum. The high tem- 
perature maximum almost coincide with the second max- 
imum of the specific heat, thus reinforcing the interpre- 
tation of the latter as a sign of the 0-transition. Also 
the first minimum/maximum always occurs at the same 
temperature as the first peak of the specific heat, which, 
as we already noted, corresponds to Tf. Fig. [3] shows 
this agreement between and R' for the hydrophobic 
homopolymer of length 12, while Fig.[5]shows an example 
of negative R' peak at Tf occurring for the unstructured 
heteropolymer. 

An inspection of the gyration radii of individual min- 
ima shows that a negative R' peak appears when the na- 
tive minimum does not have the minimal gyration radius. 
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FIG. 3: Specific heat (circles) and derivative of the gyra- 
tion radius (squares) as a function of temperature for the 
hydrophobic homopolymer of length 12. To ease comparison 
both quantities are normalized to their maximum value. 




FIG. 4: Specific heat (circles) and derivative of the gyration 
radius (squares) as a function of temperature for the unsta- 
ble heteropolymer. To ease comparison both quantities are 
normalized to their maximum value. 



As a consequence, the gyration radius might decrease 
as soon as the system starts exploring other configura- 
tions, eventually increasing again when higher temper- 
atures force it towards swollen conformations. In both 
cases the presence of either a minimum or a maximum 
in R' amounts to a change in convexity of the R ver- 
sus temperature and, once again, signals that the system 
is undergoing a significant change in its configurational 
trends. 

The a-helix-like heteropolymer shows instead only one 
peak in the specific heat. This thermodynamic behavior 
is akin to what observed in similar systems [2^ , where no 
molten globule state could be detected. In those cases Tq 
and Tf coincide. Therefore, this sequence is structurally 
very stable, its native configuration being dominant even 
at temperatures almost as high as the temperature of 
thermal unfolding. Also in this case the derivative of the 
gyration radius shows a minimum corresponding to the 
folding transition, since the native state is an elongated 
helical structure while other minima have a smaller gy- 
ration radius. 
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FIG. 5: Specific iieat (circles) and derivative of the gyration 
radius (squares) as a function of temperature for the stable 
heteropolymer. To ease comparison both quantities are nor- 
malized to maximum of their absolute value. 

In order to illustrate the agreement between the dif- 
ferent criteria for the determination of Tf in table |T] wc 
report the folding temperature as computed by means of 
the 50% criterion and relying on the first maximum of 
the specific heat. Tq is also reported. It is interesting to 
notice that the ratio between Tf and Tq, which might be 
considered as a relative measure of the stability of their 
native structure, is approximately the same (4-^-5) for the 
homopolymer of length 12 and the unstable folder. 

This table also reveals an interesting feature of the 
thermodynamics of homopolymers and its dependence on 
the system size: while Tq grows with the polymer length 
(hydrophobic compaction is modelled by two-body inter- 
actions whose effect grows with the chain length), Tf does 
not show any well defined trend and oscillates around a 
fixed value Tf. The value of the temperature Tf ~ 1.1 
determined by using probabilities is slightly lower than 
the value Tj^ ~ 1.4 determined by using the specific heat. 

In order to show that the energy landscape of ho- 
mopolymers of different length has basically the same 
shape apart from a scaling factor iV^, in Fig. [S] we report 
the histogram p{Vm) of the rescaled potential V — VN^'^ 
in each minimum. The histograms of the homopolymers 





Tf,o (n) 


Tf,i 


Te 


Stablel2 


3.95 (1) 


4.2 


4.2 


Unstablel2 


0.21 (1) 


0.33 


1.5 


Homol2 


1.25 (2) 


1.6 


6.5 


Homo 11 


1.1 (2) 


1.4 


4.2 


HomolO 


0.6 (1) 


0.6 


3.1 


Homo09 


1.3 (2) 


2.0 


2.5 


HomoOS 


1.0 (2) 


1.0 


0.9 



TABLE I: Folding and B-transition temperatures for the se- 
quences under study. Folding temperatures are determined by 
means of the 50% criterion (T/^o) and relying on the first peak 
of the specific heat (T/^i). Near T/^o we report in brackets the 
number n of minima used to determine it. 
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FIG. 6: Histogram of the rescaled potential V of the minima 
of the potential energy for the homopolymers of lengths TV — 
10, 11, 12. The rescaling factor is A'^"'^. The circles radius is 
proportional to TV. 

of lengths = 10, 11, 12 collapse onto the same curve. It 
must however be stressed that, since the Lcnnard- Jones 
potential is short-range, the number of units that can 
possibly interact with a given monomer is limited. As 
a consequence, for large systems we expect the energy 
to scale linearly with the chain length N. The almost 
perfect ~ A^^ scaling observed signals therefore that the 
systems studied are still very small, their linear dimen- 
sion being comparable to the Lennard-Jones range. 

In order to clarify the other observed trend, the inde- 
pendence of Tf on systems size, we preliminary observe 
that increasing the system size leads to a fast increase of 
the landscape roughness especially if measured in terms 
of number of minima and saddles. Figure [7| shows that, 
as expected in these systems, A^min grows exponentially 
with the chain length TV. The same holds for the number 
of saddles A'sad but with a different algebraic correction. 
Indeed, the ratio between these two quantities, which 
corresponds to half the average connectivity of the NMS, 
grows linearly with TV (see inset in Fig. [7]). 

It must now be noted that not only the number of 
minima increases with system size but also the total vol- 
ume of the available configurational space does so. It 
is thus reasonable to expect that their ratio, the aver- 
age volume of the basin of attraction of each minimum, 
will also experience an exponential dependence from TV. 
The logarithm of the volume of an attraction basin cor- 
responds to its entropy. As already mentioned, this can 
be estimated according to a second order approximation 
in the local energy minimum: 

3Ar-6 

S^-ksYl l°g(^f^) (12) 
fc 

where the to^'^'^ 's are the 3TV — 6 non-zero eigenfrequen- 
cies — i.e., square root of the eigenvalues — of the Hes- 
sian matrix in the minimum i. In Fig. [5] we report the 
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FIG. 7: Number of minima A'min (circles) and saddles A'sad 
(squares) for increasing chain length A*'. In the inset the ratio 
Nsad/Ninin IS reported together with a power law fit (exponent 
1.07). 
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FIG. 8: Histogram of the rescaled entropy S = S/N'^-'^^ of 
the minima of the potential energy for the homopolymers of 
lengths N = 10, 11, 12. The symbol size is proportional to A^. 



histograms of the entropies for homopolymers of various 
length after rescaling by a factor N^-^-^^. The rescal- 
ing factor was chosen in order to force the coUapse of all 
histograms onto a single curve and witnesses an approx- 
imately linear dependence of the average basin entropy 
on chain length. 

As far as the stability of the folding temperature is con- 
cerned, we recall that T/ indicates the point where the 
free energy of the native configuration is approached also 
by other minima. An increase of the system size would 
therefore produce two counteracting effects on Tf. On 
the one hand the average steepness of the landscape will 
increase, thus increasing the energetic gap between the 
native configuration and its neighbors, on the other hand 
the total number of minima will also quickly increase, 
compacting minima in the configuration space and con- 
sequently in energy. 

Finally we note that the distribution p{Vs) of the first- 
order saddles energies Vs shows a dependency on system 
size very similar to that of the energy of the minima 



P(W) 




FIG. 9: Histogram of energy barriers W between connected 
minima for homopolymers with A'^ = 10, 11, 12. The symbol 
size increases with TV. Inset: distribution of energy barriers 
for the unstable folder (squares), stable folder (circles) and 
homopolymers with JV = 12 (crosses). 



(data not shown). It is therefore tempting to picture the 
effect of increasing chain lengths as a simple isometric 
stretching of the energy landscape, as could be attained 
by multiplying energy by a constant factor. This picture 
apparently clashes with the observation that the profiles 
of the histograms of the energy barrier heights W are 
pretty similar for all chain lengths, and consistent with 
the same exponential function (see Fig. [9]). Also this 
effect, however, can be accounted for by the quick growth 
of competing minima. The appearance of new minima 
helps breaking conformational transitions in more sub- 
steps, thus counteracting the general quadratic increase 
of their energetic cost. 



VI. THE HYDROPHOBIC HOMOPOLYMERS: 
TOPOLOGY AS A FUNCTION OF 
TEMPERATURE AND SYSTEM SIZE 

The thermodynamic properties analyzed so far only 
depend on the distribution of the minima of the poten- 
tial energy. In order to account for the trends observed in 
the folding and 0-transition temperature we were led to 
analyze in some detail the chain length dependence of the 
energy landscape both in terms of minima and connec- 
tion saddles. Since potential energy minima correspond 
to nodes of the zero-temperature NMS and saddles cor- 
respond to its links, the results of the previous analysis 
might be rephrased in terms of an exponential growth of 
the graph degree with system size and a linear growth of 
its average connectivity. We will now extend this analysis 
to the finite-temperature NMS. More precisely we will fo- 
cus on the combined effects of temperature and systems 
size on the topology of the NMS for homopolymers. 

The first quantity affected by the renormalization pro- 
cess is obviously the graph size. The total number of 
nodes for each homopolymer is reported in Table |lT] for 
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various temperatures corresponding to fixed multiples of 
Tf (the average folding temperature for homopolymers, 
determined using the peak in the specific heat) namely 
0.5Ty, Tf and 1-57/. Graph sizes appear to decay ex- 
ponentially with temperature, with a decay rate growing 
approximately linearly with the chain length. A similar, 
yet more irregular, decay can be observed in the number 
of links of the renormalized graph reported in table IIIII 

In order to understand whether the process of node 
coalescence that causes the NMS to shrink with temper- 
ature proceeds uniformly on the NMS, we analyze the 
temperature dependence of the multiplicities rrii of each 
node, i.e., the number of nodes of the zero-temperature 
graph that coalesced into the i-th node. In Fig. [10] we re- 
port the histograms of rrii for the iV = 12 homopolymer 
at various temperatures. The isolated column develop- 
ing on the right of the figure represents the multiplicity of 
the minimal energy node that is growing at the expense 
of the nodes with intermediate multiplicity (m^ > 10). 
The low multiplicity nodes are almost unaffected. These 
observations suggest that low saddles are highly localized 
around the bottom of the energy landscape. Concerning 
the latter point, we recall that two nodes coalesce when 
they are separated by a link corresponding to a saddle of 
energy lower than the current temperature. If the ma- 
jority of nodes tend to coalesce on the minimal energy 
node, it means that this node is, at every temperature, 
the one characterized by the lowest energy connections. 
Furthermore, the fact that isolated nodes of intermediate 
multiplicity, i.e., isolated regions of the landscape charac- 
terized by the presence of low energy saddles, end up very 
early in the minimal energy node reinforces the above 
picture on the localization of low energy saddles. 

The presence of a unique metastable state around and 
above Tf implies that, at these temperatures, long ho- 



iV 


r = 0.0 


r = 0.7 


r = 1.4 


T = 2.1 


12 


11120 


5276 


2684 


1372 


11 


3313 


1754 


976 


532 


10 


999 


564 
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67 


51 
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TABLE II: Graph size during renormalization at four different 
temperatures, for homopolymers of different lengths. 



FIG. 10: Histograms of the multiplicities rm of each node 
of the renormahzed graph for the A'' = 12 homopolymer at 
temperatures T = 0.7,1.4,2.1. Increasing temperatures are 
represented by lines of increasing thickness. 



mopolymers are characterized by the presence of an ex- 
tended, fast-connected, low-energy region where local re- 
arrangements take place more quickly than in the rest of 
the landscape. 

The opposite scenario emerges when looking at shorter 
chains where the fraction of low multiplicity nodes de- 
creases. For example in Table HVl we report, for the vari- 
ous homopolymers, the fraction of nodes with multiplic- 
ity 1 at different temperatures. This is the fraction of 
nodes still untouched by renormalization and, therefore, 
separated from the rest of the NMS by steep activation 
barriers. Table IIVI shows that short homopolymers are 
more affected by renormalization and loose most of their 
nodes of multiplicity 1. This phenomenon might be due 
to two slightly different mechanisms: either the low en- 
ergy node directly attracts low multiplicity nodes or the 
renormalization process affects in the same manner nodes 
of all multiplicities. In both cases the previous argu- 
ment about the spatial localization of low energy barri- 
ers near the native minimum falls and the landscape can 
no longer be divided in slow and fast regions. Remark- 
ably, the difference in the spatial distribution of energy 
barriers on the landscape described above arises indepen- 
dently from the actual probability distribution of barrier 
heights p{Vs) (see Fig. [9]) which, we recall, is pretty much 
size- independent . 



TV 


r = 0.0 


r = 0.7 


r = 1.4 


T = 2.1 




N 


T = 0.7 


r = 1.4 


T = 2.1 


12 


73468 


50802 


17374 


5700 




12 


0.67 


0.65 


0.67 


11 


20392 


14974 


7752 


2456 




11 


0.67 


0.63 


0.65 


10 


5710 


4236 


2920 


1600 




10 


0.70 


0.62 


0.59 


9 


1562 


1280 


1130 


794 




9 


0.70 


0.65 


0.61 


8 


482 


364 


298 


210 




8 


0.65 


0.52 


0.41 



TABLE III: Total number of connections during renormaliza- 
tion at four different temperatures for five hydrophobic ho- 
mopolymers of different lengths. 



TABLE IV: Fraction of the nodes of the renormalized NMS 
having multiplicity equal to one. Data refer to hydrophobic 
homopolymers of different lengths at various temperatures. 
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Finally, to document the relative importance of the 
fast-connected central regions in the different sequences 
analyzed, we report in Table fVl the number of minima of 
the potential attracted by the maximally growing node 
in the graph. This almost always coincides with the 
minimal energy node except for four low temperature 
cases dubbed with a star in the table. A comparison at 
the same temperature between homopolymers of differ- 
ent lengths shows that the fraction of minima attracted 
by the minimal energy node (in parenthesis in the table) 
grows with the chain length. It can be observed that 
by adding the number of nodes at a given temperature 
(Table HI)) to the number of minima coalesced to the min- 
imal energy node one gets a much higher proportion of 
the total number of minima for long chains than for short 
ones. For example, at T = 2.1 the = 12 polymer has 
8828 minima in the minimal energy node and 1372 nodes 
surviving on the graph. The latter nodes were much less 
affected by the merging process, with only about a de- 
crease of about 10% of their number. In the iV = 10 case 
the decrease is instead of 35%, once again implying that 
coalescence is more uniformly distributed in this graph. 

In the previous section we documented the linear in- 
crease of the average connectivity of the graph with the 
system size. At finite temperatures this quantity displays 
a more complex behavior, as shown in Table lVTl where the 
average connectivity c of the renormalized graphs of the 
five homopolymers is reported. After an initial growth 
the longer chains show a decrease of the average con- 
nectivity with temperature while the two shorter ones 
exhibit a growing connectivity up to T = 2.1. The tem- 
perature of peak connectivity diminishes with the chain 
length. 

Figure [TT] shows the effect of temperature on the dis- 
tribution of the connectivities Ui for the homopolymer 
of chain length 12 and 10 (inset). While the total num- 
ber of connections is almost invariant upon renormal- 
ization for the shorter systems, the longer one shows a 
remarkable loss of connectivity above rii = 10. In both 
cases the growth of an isolated, high connectivity region 
is observed. A closer inspection reveals that this region 
corresponds to the minimal energy node that, while at- 
tracting its neighbors, gradually acquires new connec- 



iV 


r = o.7 


T = 1.4 


r = 2.1 


12 


453 (0.04) 


5908 (0.53) 


8828 (0.79) 


11 


52 (0.02) * 


1043 (0.31) 


2324 (0.70) 


10 


38 (0.04) * 


117 (0.18) 


429 (0.43) 


9 


6 (0.03) * 


18 (0.06) 


57 (0.19) 


8 


4 (0.04) * 


7 (0.06) 


21 (0.20) 



TABLE V: Multiplicity of the maximally growing node during 
renormalization at three different temperatures for homopoly- 
mers of different lengths. When the maximally growing node 
does not correspond to the minimal energy node data are la- 
beled with a star. The fraction of minima coalesced to the 
minimal energy node is reported within parentheses. 
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FIG. 11: Histograms of the number rii of contacts of each 
node for the = 12 and A'^ = 10 (inset) homopolymers at 
temperatures T = 0.0,0.7,1.4,2.1 and T = 0.0,2.1 respec- 
tively. Increasing temperatures are represented by lines of 
increasing thickness. 



tions. It is thus tempting to ascribe the observed dif- 
ferences in average connectivity to the different rates of 
coalescence to this node previously observed in long and 
short chains. We remind that the average connectivity 
c decreases upon renormalization only if there are many 
shared connections between the coalescing nodes. The 
average connectivity increases significantly during the 
fast initial growth of the "central" node, signaling that 
each coalescing node is bringing new connections. As a 
consequence the ramifications of the "central" node are 
expanding in areas of the NMS which were previously rel- 
atively remote in terms of connections. After that phase 
a decrease in connectivity begins, showing that any new 
node attracted by the fastest growing node brings only a 
few new connections: the central Node has now a direct 
link to almost every portion of the NMS. 

We now describe the spectral properties of the NMSs 
of the sequence studied. More specifically we investigate 
their spectral dimension because it determines the large 
scale diffusivity on the graph. The spectral dimension 
describes the spectral density of the discrete Laplacian 
matrix for small eigenvalues. Since small eigenvalues cor- 
respond to long relaxation paths, high spectral dimen- 
sions imply a relaxation to equilibrium that requires a 
long series of jumps between nodes. Hence, although 
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r = o.7 


T = 1.4 


T = 2.1 


12 


13.2 


19.3 


12.9 


8.4 


11 


12.2 


17.0 


15.9 


9.2 


10 


11.4 


15.0 


16.4 


14.0 


9 


10.4 


13.0 


14.2 


15.0 


8 


9.2 


10.8 


11.6 


12.0 



TABLE VI: The average connectivity c for five hydrophobic 
homopolymers of different lengths at various temperatures. 
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FIG. 12: Rank-to-eigenvalue plot for the Laplacian matrix 
of the hydrophobic homopolymer of length 9 at increasing 
temperature: T = 0.0, 0.7, 1.4, 2.1. The circles radius is 
proportional to temperature. Dashed lines represent least- 
square power-law fits to the data. 



lacking any direct kinetic information, the spectral di- 
mension provides information about the dynamics of the 
system. In Fig. [T^] we illustrate the numerical procedure 
for the determination of the spectral dimension for the 
iV = 9 homopolymer at various temperatures. After a 
numerical diagonalization of the Laplacian matrix of the 
NMS the resulting eigenvalues are rank ordered. The 
resulting rank-to-eigenvalue curve is proportional to the 
integrated spectral density and can be fitted with a power 
law in order to extract the spectral dimension d. It must 
be stressed that one can rigorously assign a topological 
meaning to d only for asymptotically large graphs: sev- 
eral invariant properties of the spectral dimension, such 
as invariance under local link rewiring, only hold in the 
limit of infinite size. One might thus try to estimate the 
effects of the finite size of the analyzed graphs by quanti- 
fying the amount of such non-invariance. Before evaluat- 
ing d we therefore altered each NMS by adding random 
links between second neighbors (non-directly connected 
nodes connected to each other by a third node). The 
procedure was repeated 50 times generating 50 different 
realizations of each NMS and 50 different estimates of 
d (see Fig. [T3] where a sample of the rank-to-eigenvalue 
plots obtained after random rewirings is reported). Since 
this quantity should be invariant by link addition the 
standard deviation of the values obtained with this pro- 
cedure provides a measure of the numerical error in its 
estimate. The resulting averages (rf) are reported in Ta- 
ble IVIII The associated statistical errors range from 3% 
to 5%. 

The spectral dimension grows for the homopolymers 
more quickly than the dimension of the associated con- 
figurational space. On the other hand, at least for 
short polymers, it does not appreciably change with tem- 
perature coherently with the expected isospectrality of 
the renormalization procedure. For longer polymers the 
spectral dimension estimate d is affected by an unex- 



FIG. 13: Rank-to-eigenvalue plot for the Laplacian matrix 
of the hydrophobic homopolymer of length 12 for different 
random rewirings. 



pected spectral feature: at a certain temperature the 
Laplacian spectral density peaks around A = 1. It can 
be shown that these eigenvalues are due to nodes con- 
nected to more than one leaf node (by leaf we mean a 
node with only one link) . Initially the presence of a node 
connected to two or more leafs is rare but, as renormal- 
ization goes on, it tends to increase. The appearance 
of A = 1 eigenvalues is therefore enhanced by a high 
coalescence rates. As shown above, long homopolymers 
are indeed characterized by the fast coalescing region of 
the lowest-energy minimum, which explains the early ap- 
pearance of the mark of multiple leaf nodes in their spec- 
tra. This phenomenon is depicted in Fig. [14] where the 
rank-to-eigenvalue plot for the = 12 at various tem- 
perature is reported. Ranked data have been multiplied 
by an arbitrary factor to ease reading. At low tempera- 
tures the curves are pretty much invariant and spectral 
density is conserved. At T = 1.4, however, a step ap- 
pears in the curve at A = 1 (corresponding to a Dirac 5 
peak in the spectral density coming from the abundance 
of leaf nodes). Notably a steep drop in the spectral di- 
mension can be observed at the same time and persist at 
higher temperatures while the step at A = 1 grows. Table 
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4.2 


4.6 
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4.4 


8 


3.2 


3.0 
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TABLE VII: Average spectral dimensions of the five ho- 
mopolymers for different chain lengths at different temper- 
atures. Data marked with a star refer to spectra with a sig- 
nificant number of A=l eigenvalues. Missing data refer to 
cases where not enough points where available below A=l to 
allow a reliable fitting. 
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FIG. 14: Rank-to-eigenvalue plot for the Laplacian matrix 
of the hydrophobic homopolymer of length 12 at increasing 
temperature: T=0.0, 0.7, 1.4, 2.1. Larger symbols correspond 
to higher temperature. Rank data were multiplied for an 
arbitrary factor to ease reading. Dashed lines represent least- 
square power-law fits to the scale invariant portion of the 



IVIIl however, shows that, although the appearance of leaf 
nodes — labeled by stars in the table — is often associ- 
ated with changes in the spectral dimension, the latter 
do not share a clearly defined direction. In some cases 
{N = 11) J increases after the appearance of A = 1 eigen- 
values while in others (A^ = 12) it decreases. Moreover, 
in a few other cases the spectral dimension could not be 
estimated because not enough data where available below 
A=l (to allow a reliable fit), and fitting above this value 
does not seem correct since rank-to-eigenvalue curves ap- 
pear either to quickly lose any scale-invariant character 
for A = 1 or to display a markedly different exponent 
(see the T — 2.1 curve in fig [14]). Such anomalies there- 
fore suggest that the procedure here highlighted for the 
computation of the spectral dimension looses meaning in 
presence of pronounced peaks in the spectral density. 



VII. THE HETEROPOLYMERS: TOPOLOGY 
AS A FUNCTION OF SEQUENCE 

The effect of the primary sequences on the topologi- 
cal properties of the NMS is now determined by investi- 
gating the two heteropolymers with 12 residues and by 
comparing the results with those for to the = 12 ho- 
mopolymer. 

A first fundamental difference between the stable het- 
eropolymer and the other sequences arises already at 
T — when considering the size of the NMS, which at 
this temperature simply corresponds to the number of 
minima of the potential energy. The strong energetic bias 
and the relative lack of frustration of this system result in 
a much smaller quantity of minima, 4128, than in the case 
of the weakly biased unstable heteropolymer which has 
instead 13394 minima. The latter is much more similar 
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FIG. 15: Rank-to-eigenvalue plot for the Laplacian matrix 
of unstable folder (squares), stable folder (circles) and ho- 
mopolymer with A'' = 12 at T = 0. Continuous lines represent 
least-square power-law fits to the data. 



to the homopolymer which, we recall, has 11120 minima. 
The lower number of possible kinetic traps to be over- 
come by the stable heteropolymer already suggests its 
higher folding propensity. On the contrary the picture 
provided by considering NMS connectivities is much less 
clear, since the average number of connecting saddles for 
each minimum is 16.0 for the stable heteropolymer and 
27.4 for the unstable one while it was 13.2 for the ho- 
mopolymer of equal length. No particular correlation 
with structural stability can be therefore observed. 

This scenario is further emphasized by a topologi- 
cal difference between the zero-temperature NMS of the 
stable heteropolymer and the other sequences. While 
the spectral dimension of the unstable heteropolymer is 
comparable to that of the homopolymer of equal length 
{d = 16.6 ± 0.9 in the first case and d = 16.4 ± 0.9 in 
the latter), the stable folder is characterized by a much 
smaller spectral dimension: d = 10.3 ± 0.6. These dif- 
ferences are preserved also at higher temperatures since, 
also in this case, spectral dimensions keep substantially 
unchanged until the appearance of A = 1 eigenvalues. We 
recall that the unstable heteropolymer and the = 12 
homopolymer also share a very similar ratio Tf/T^. The 
spectral dimension, a topological quantity that describes 
relaxation dynamics, does therefore correlate with the 
latter ratio, which is a thermodynamical quantity. 

This finding relates with analogous evidence [iSj for a 
two-dimensional toy model where a difference in the spec- 
tral dimension between heteropolymers and homopoly- 
mers was detected. In that case, however, the spectral 
dimension was not found to depend on the stability of 
the native structure but rather on the amount of het- 
erogeneity in the sequence. Different sequences display- 
ing different folding behaviors and landscape steepness 
but the same ratio of polar to hydrophobic beads shared 
the same spectral dimension. Moreover, such a depen- 
dency only appeared in that model for finite-temperature 
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NMSs, while here it is evident already at T = 0. 

The two heteropolymers analyzed here share the same 
fast decay of the tails of the energy barriers distribu- 
tion seen for the homopolymer (see inset in Fig. [9]). In 
the heteropolymer case the decay is slightly faster and 
leads to a substantial cutoff at K — 80 for the unsta- 
ble folder. A far more fundamental difference, however, 
arises at small energies. The unstable folder differs from 
the two other sequences in having a much higher fraction 
of low barriers, Vs < 10. Since this is the region most 
affected by the renormalization procedure, it is reason- 
able to expect a different behavior of this sequence under 
renormalization. Indeed, the comparison of the topolog- 
ical changes induced by the renormalization process on 
the NMSs of polymers characterized by different primary 
sequences proves very instructive about the origin of the 
above mentioned leaf node phenomenon. It is not obvious 
which physical meaning to assign to the gradual trans- 
formation of the NMS in a stellar graph characterized by 
a center surrounded by leaf nodes. It is also equally non- 
intuitive what relation might this process have with the 
thermodynamics of the system. The analysis of the spec- 
tra of the sequences analyzed (see Table |VlTT]) shows that 
A = 1 eigenvalues always become dominant at Tq imply- 
ing that at that temperature leaf nodes become the only 
relevant feature of the NMS. In fact, at Tq the graph has 
already shrunk in size by more than one order of magni- 
tude and almost all of the missing nodes appear to have 
merged to the minimal energy node, which seems thus to 
be the hot-spot of renormalization also for heteropoly- 
mers. 

Analyzing the same indicators at lower temperatures 
(columns referring to T = in Table IVTII)) shows, how- 
ever, a marked difference in the progress of the appear- 
ance of leaf nodes against other topological transforma- 
tions of the NMS. While at T = the minimal energy 
node has eaten up already all the available configuration 
space, still the relative amount of A = 1 is significantly 
smaller than the value it takes at Tq. The appearance 
of leaf nodes seems therefore to be intimately connected 
with the 0-transition in these sequences, depicting the 
moment in which all the configuration space becomes 
equally accessible. In this condition the metastable states 
still persisting as separate nodes are very unlikely to have 
multiple connections because all of their neighbors al- 
ready collapsed on the minimal energy node. As a conse- 
quence these nodes, which originally corresponded to the 
frontier of the low temperature NMS, have a large prob- 
ability of becoming a leaf node shortly before coalescing 
themselves to the minimal energy node. 

The pace of this process is strongly sequence- 
dependent and does not show any clear correlation with 
the stability of the native structure. It nonetheless ap- 
pears to start at markedly low temperatures for ho- 
mopolymers. 

Furthermore, we remember that the the appearance of 
leaf nodes also depends on chain length. Shorter poly- 
mers have a less marked frequency of A = 1 eigenval- 



ues also in the proximity of To (Table IVTI)) . It is there- 
fore worth stressing the differences in the 0-transition in 
short and long polymers. Regardless of chain length, the 
0-transition signals the temperature at which all configu- 
ration space becomes equally accessible. When this hap- 
pens in short polymers, the landscape is still divided into 
different metastable states or, equivalently, there are still 
kinetic barriers to be overcome to access the whole config- 
uration space. On the contrary, in long polymers, at the 
0-transition almost all the configuration space belongs 
to the same stationary state and can be quickly sampled 
without crossing kinetically relevant barriers. This does 
not mean that large barriers are absent from the energy 
landscape, on the contrary they are as common as in 
short polymers. Their arrangement, however, is not ca- 
pable of dividing the landscape into kinetically separate 
metastable states. 



VIII. CONCLUSIONS 

We analyzed the thermodynamics as well as some 
metric and topological properties of the NMS of short 
homopolymers and heteropolymers in a coarse-grained 
off-lattice protein model. The homopolymers were 
hydrophobic in nature and characterized by different 
lengths, while the two heteropolymers were chosen in or- 
der to ensure a marked difference in structural stability. 

The systems investigated exhibit a variety of thermo- 
dynamical behaviors with respect to folding. While some 
sequences show all three classical protein conformational 
arrangements -including the molten globule- others pass 
directly from folded to swollen, thus reproducing the ten- 
dency of stable folders to have high Tf. The model con- 
sidered is therefore able of efficiently mimicking the dif- 
ferences of folding propensities observed in real protein 
sequences. 

We have employed a recently proposed procedure [iGj 
to generate the NMS at a finite temperature T, based 
on the merging of nodes separated by energy barriers 
lower than T. This renormalization procedure allows 
to determine a partition of the configuration space into 
dynamically separated regions. We have thus studied 
the temperature dependence of the NMS characteristic 
of each sequence. The analysis of the topological prop- 
erties of the NMS of homopolymers of different lengths 
highlights how the statistical distribution of activation 
energies is not the only factor determining the behavior 
of the graph under renormalization: the spatial distribu- 
tion of the barriers also plays a crucial role. Indeed, al- 
though investigated homopolymers share the same distri- 
bution of energy barriers, their topology respond pretty 
differently to an increase in temperature. While in long 
sequences the renormalization procedure mainly alters 
the low energy region of the landscape where metastable 
states merge into a unique, rapidly expanding macro- 
state, in shorter sequences this growth is more uniformly 
distributed and other, more distant metastable states 
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surviving 


multiplicity of 


eigenvalues 




nodes 


largest node 


with A = 1 




rp _ 2Te 
a 


T = Te 




T = Te 




T = Te 


Stablel2 


155 


73 


0.94 


0.97 


0.52 


0.74 


Unstablel2 


402 


170 


0.96 


0.99 


0.37 


0.91 


Homo 12 


109 


54 


0.99 


0.99 


0.88 


0.93 



TABLE VIII: Number of surviving nodes, fraction of nodes in the minimal energy node and fraction of eigenvalues A = 1 for 
the three polymers of length 12 during renormalization. 



might experience a substantial expansion. These differ- 
ences in the renormalization process are reflected in an 
early (low T) tendency to create leaf nodes in the NMS of 
long sequences, both for homopolymers and heteropoly- 
mers. This phenomenon is independent on the stability 
of the native structures of the sequences analyzed and 
completes around the 0-temperature, where the config- 
uration space mostly belongs to same metastable state. 

The spectral dimension of the same sequence at dif- 
ferent temperatures is approximately constant, at least 
before leaf nodes appear. Given the same length, the 
spectral dimension appears to be higher for the sequences 
characterized by unstable native structures. This find- 
ing suggests a non-obvious link between thermodynam- 
ics and the dynamical properties of these systems. Se- 
quences characterized by a stable folded state appear to 
be characterized by shorter and less complicated relax- 
ation paths. Indeed, the stable heteropolymer is charac- 
terized by simpler connectivity also in terms of the sheer 
number of nodes, which, compared to the other = 12 
sequences, ranges from one half {T = 0) to one order of 
magnitude less (Tq). 

Interestingly some of the features described above seem 
to hold also for a simpler two-dimensional model investi- 



gated in [l6|. Also in that model the spatial distribution 
of the barriers is crucial in shaping some properties of the 
energy landscape, namely the amount of kinetic traps. 
However, in that case only kinetic properties of the NMS 
were affected and not its topology. On the contrary, in 
the model here investigated topology is strongly influ- 
enced by the spatial distribution of energy barriers. Fi- 
nally, in both models the spectral dimension helps in cat- 
egorizing sequences according to the complexity of their 
relaxation paths. In the two-dimensional model, how- 
ever, such a difference only arises at finite temperature 
and is related to sequence frustration rather than to the 
stability of the native structure. 
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